Transcription of New Speaking Styles - Voicemail
Abstract
In this paper we describe a new testbed for developing speech recognition algorithms: a VoiceMail transcription task, analogous to other tasks such as the Switchboard, CallHome [1] and Hub 4 [2] tasks, which are currently used by speech recognition researchers. Spontaneous speech occurring in day-to-day life can broadly be classified into two categories: (i) where the speaker does not receive any external feedback to direct his/her speech, and (ii) where the speaker receives external feedback from another person, machine or audience. Examples of the former category are radio broadcast news and voicemail; examples of the latter are telephone conversations, natural language transaction systems (e.g. ATIS) and seminars. In general, to obtain the best performance in transcribing a certain style of speech, it is necessary to train the speech recognition system on training data of a similar style. Some of the speech categories mentioned above are quite well represented in currently existing databases. However, voicemail data is not well represented in any database, even though it represents a very large volume of real-world speech data. Consequently, there is a need for a Voicemail database, both to improve transcription performance on a voicemail transcription task and to establish a new testbed for speech recognition algorithms. Similar to the Switchboard/CallHome databases, the Voicemail database comprises telephone-bandwidth spontaneous speech. However, the difference with respect to the Switchboard and CallHome tasks is that the interaction is not between two humans, but rather between a human and a machine. Consequently, the speech is expected to be a little more formal in nature, without the problems of cross-talk, barge-in, etc. This eliminates some of the variables and provides more controlled conditions, enabling one to concentrate on the aspects of spontaneous speech and the effects of the telephone channel.
In this paper, we describe how the speech data was collected and some algorithmic techniques that were devised based on this data. We also describe the initial transcription results on this task.
Similar resources
Speech recognition performance on a new voicemail transcription task
In this paper we describe a new testbed for developing speech recognition algorithms: a VoiceMail transcription task, analogous to other tasks such as the Switchboard, CallHome, and Hub 4 tasks, which are currently used by speech recognition researchers. We describe the collection and use of a new VoiceMail database (that is available to the research community through the LDC), and also desc...
Recent improvements in speech recognition performance on large vocabulary conversational speech (voicemail and switchboard)
In this paper we report recent improvements in word error performance on a voicemail transcription task. Last year, the speaker independent word error rate (WER) on the dev test set of the Voicemail Transcription task was reported at 35.45% [1]. This year, we report a relative 20% gain over this number. The improvements were obtained using several new algorithms and an increased amount of train...
Performance Improvements in Voicemail Transcription
In this paper we report recent improvements in word error performance on a voicemail transcription task. Last year, the speaker independent word error rate (WER) on the dev test set of the Voicemail Transcription task was reported at 35.45% [1]. This year, we report a relative 20% gain over this number. The improvements were obtained using several new algorithms and an increased amount of train...
Porting: SwitchBoard to the VoiceMail task
This paper examines techniques that allow a well-trained source system built on one task to be rapidly adapted, or ported, to another target task. The two tasks considered in this paper are Hub5, or Switchboard, as the source system and VoiceMail as the target task. The two tasks are acoustically similar, both being telephone-bandwidth speech tasks, but differ in speaking style. SwitchBoard is c...
Recent improvements in voicemail transcription
In this paper we report recent improvements in voicemail transcription. The voicemail transcription task was introduced last year [1] as representing a style of conversational telephone speech that is somewhat different from the Switchboard and CallHome [2] databases. Last year, the speaker independent and speaker adapted word error rates (WER) on this task were reported at 41.94% and 38.18% re...
Publication date: 1997